1
00:00:40,040 --> 00:00:42,520
On completion of this training sequence,

2
00:00:42,520 --> 00:00:46,540
you will be able to understand the different quality assurance systems,

3
00:00:46,540 --> 00:00:50,770
as well as the quality assurance criteria used in the market.

4
00:00:50,770 --> 00:00:53,900
Finally, you will learn how to reach MARS,

5
00:00:53,900 --> 00:00:58,090
or your capacity to produce accurate and rapid live subtitles.

6
00:00:59,580 --> 00:01:01,950
This is the agenda of this presentation.

7
00:01:09,640 --> 00:01:11,820
When dealing with quality assurance,

8
00:01:11,820 --> 00:01:18,060
the four main recurrent criteria mentioned are accuracy, errors, speed and delay.

9
00:01:18,060 --> 00:01:19,940
When it comes to accuracy,

10
00:01:19,940 --> 00:01:23,270
various approaches to subtitling are mentioned,

11
00:01:23,270 --> 00:01:27,070
each corresponding to specific strategies to apply,

12
00:01:27,070 --> 00:01:32,540
which may be exclusive to that approach or transversal to more than one.

13
00:01:32,540 --> 00:01:35,180
These approaches are litteratim,

14
00:01:35,180 --> 00:01:38,650
or the transcription of each sound of a given text;

15
00:01:38,650 --> 00:01:41,780
verbatim, or the transcription of each word;

16
00:01:41,780 --> 00:01:45,650
sensatim, or the rendition of each meaning;

17
00:01:45,650 --> 00:01:50,830
and signatim, or the rendition of each non-verbal sign.

18
00:01:50,830 --> 00:01:56,040
Depending on the notion of accuracy requested by the client or the user,

19
00:01:56,040 --> 00:01:59,370
our approach to subtitling will change.

20
00:01:59,370 --> 00:02:04,060
The most common ones are the verbatim and the sensatim approaches.

21
00:02:09,970 --> 00:02:12,800
When assessing the quality of live subtitles,

22
00:02:12,800 --> 00:02:18,180
the first thing that hearing people notice is that the subtitles are delayed

23
00:02:18,180 --> 00:02:22,870
compared to the actual utterance of a given sentence.
24
00:02:22,870 --> 00:02:26,170
This may sound quite banal to people in the field,

25
00:02:26,170 --> 00:02:30,860
but it is something I always find myself explaining to clients.

26
00:02:32,570 --> 00:02:36,040
There are basically two ways of counting delay,

27
00:02:36,040 --> 00:02:40,890
and the choice depends on the approach used to subtitle a speech.

28
00:02:40,890 --> 00:02:43,260
In the case of verbatim subtitles,

29
00:02:43,260 --> 00:02:46,760
delay is counted from the spoken word being uttered

30
00:02:46,760 --> 00:02:50,520
to the same word appearing on screen in written form.

31
00:02:51,380 --> 00:02:55,940
In the case of sensatim or interlingual subtitles,

32
00:02:55,940 --> 00:02:58,380
delay is counted differently.

33
00:02:58,380 --> 00:03:03,720
In particular, delay is counted from the concept being uttered by the speaker

34
00:03:03,720 --> 00:03:07,980
to the same concept appearing on screen as written words.

35
00:03:07,980 --> 00:03:14,680
If subtitles appear one word after the other, as snake or roll-up subtitles,

36
00:03:14,680 --> 00:03:20,120
delay is calculated between the average time of the spoken concept,

37
00:03:20,120 --> 00:03:24,220
that is, between the first and the last word being spoken,

38
00:03:24,220 --> 00:03:27,850
and the average time of the corresponding written concept,

39
00:03:27,850 --> 00:03:32,300
between the first and the last word being written down.

40
00:03:32,300 --> 00:03:35,770
If subtitles appear as block subtitles,

41
00:03:35,770 --> 00:03:40,980
delay is calculated from the last word of an uttered concept

42
00:03:40,980 --> 00:03:45,730
to the time the block with the same concept appears on screen.

43
00:03:52,830 --> 00:03:56,590
Again, depending on the approach adopted when subtitling,

44
00:03:56,590 --> 00:04:01,010
various taxonomies exist to assess the quality of the subtitles.
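The two delay measures described above can be sketched in code. This is a minimal illustration, not part of any official toolkit; function names and the example timings are hypothetical.

```python
def rollup_delay(spoken_start, spoken_end, written_start, written_end):
    """Delay for snake/roll-up subtitles: the difference between the
    average time of the spoken concept (midpoint between its first and
    last spoken word) and the average time of the written concept.
    All times are in seconds."""
    spoken_mid = (spoken_start + spoken_end) / 2
    written_mid = (written_start + written_end) / 2
    return written_mid - spoken_mid


def block_delay(spoken_end, block_appears):
    """Delay for block subtitles: from the last word of the uttered
    concept to the moment the block appears on screen."""
    return block_appears - spoken_end


# Hypothetical example: a concept spoken from t=10.0s to t=12.0s,
# rendered word by word on screen from t=14.0s to t=17.0s,
# or as a block appearing at t=16.5s.
print(rollup_delay(10.0, 12.0, 14.0, 17.0))  # 4.5 seconds
print(block_delay(12.0, 16.5))               # 4.5 seconds
```

Note that the two measures coincide here only by construction of the example; in practice a block subtitle and a roll-up rendering of the same concept yield different delays.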
45
00:04:01,010 --> 00:04:05,140
When it comes to verbatim subtitles, two main models are used:

46
00:04:05,140 --> 00:04:10,780
the traditional word error rate, or WER, and the NER model.

47
00:04:11,640 --> 00:04:15,000
The NER model is widely used in the profession.

48
00:04:15,000 --> 00:04:17,810
It consists of the following formula:

49
00:04:17,810 --> 00:04:22,000
the total number of words minus edition and recognition errors.

50
00:04:22,000 --> 00:04:24,840
The result is divided by the total number of subtitled words

51
00:04:24,840 --> 00:04:27,210
and multiplied by 100.

52
00:04:27,210 --> 00:04:30,780
Let's take the sentence in the slide as an example:

53
00:04:30,780 --> 00:04:36,520
«Well, you know, you have to try and put out a good performance.

54
00:04:36,520 --> 00:04:39,950
I mean, yeah, it's kinda stepping stone, ain't it?»

55
00:04:41,300 --> 00:04:47,810
Let's suppose the subtitles remove these features of orality and read:

56
00:04:47,810 --> 00:04:52,130
«You must try to put out a good performance. It's a stepping stone».

57
00:04:52,130 --> 00:04:58,700
In this case, if we apply the NER model, the quality of the subtitles is 100%,

58
00:04:58,700 --> 00:05:04,800
because every omission or modification improves the subtitles, so none counts as an error.

59
00:05:05,790 --> 00:05:09,920
When it comes to sensatim subtitles, however,

60
00:05:09,920 --> 00:05:13,410
you need to use different models

61
00:05:13,410 --> 00:05:17,140
to assess the quality of the subtitles.

62
00:05:17,140 --> 00:05:20,150
Two main models are used:

63
00:05:20,150 --> 00:05:24,500
the Idea-unit Rendition Assessment, or IRA,

64
00:05:24,500 --> 00:05:29,220
and the Weighted Idea-unit Rendition Assessment, or WIRA.

65
00:05:29,220 --> 00:05:35,420
These two models consider the meaning of each subtitle rather than the words.

66
00:05:35,420 --> 00:05:36,840
In the previous example,

67
00:05:36,840 --> 00:05:42,520
the same quality rate would result from applying either IRA or WIRA.
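The NER formula spelled out above — total words minus edition and recognition errors, divided by the total and multiplied by 100 — can be written as a one-line function. This is a sketch of the basic formula only; the function name and example figures are illustrative.

```python
def ner_score(total_words, edition_errors, recognition_errors):
    """Basic NER accuracy rate: (N - E - R) / N * 100,
    where N is the total number of subtitled words,
    E the edition errors and R the recognition errors."""
    n = total_words
    return (n - edition_errors - recognition_errors) / n * 100


# The stepping-stone example above: the 13-word subtitle contains
# only correct editions, so E = R = 0 and the score is 100%.
print(ner_score(13, 0, 0))   # 100.0

# A hypothetical 200-word stretch with 3 edition errors
# and 2 recognition errors:
print(ner_score(200, 3, 2))  # 97.5
```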
68
00:05:42,520 --> 00:05:45,390
However, the rationale is totally different.

69
00:05:45,390 --> 00:05:50,310
Even a sentence expressing the same meaning with different words

70
00:05:50,310 --> 00:05:54,860
and a different structure would result in 0 errors.

71
00:05:55,950 --> 00:06:01,130
Regardless of the model used to assess errors,

72
00:06:01,130 --> 00:06:05,190
errors are normally categorised in a similar way,

73
00:06:05,190 --> 00:06:09,550
be they recognition errors or edition errors,

74
00:06:09,550 --> 00:06:15,620
that is, errors caused by the machine or by a misuse of the machine.

75
00:06:16,613 --> 00:06:18,824
Mistakes can be minor,

76
00:06:18,824 --> 00:06:22,320
as in the case of George Bush being written with a lowercase b.

77
00:06:23,630 --> 00:06:28,950
Such errors are rarely corrected by the live subtitler or by the live editor,

78
00:06:28,950 --> 00:06:31,880
as they are easy to understand.

79
00:06:31,880 --> 00:06:36,670
Standard errors are less easy to understand,

80
00:06:36,670 --> 00:06:41,620
as they result in very different, though phonetically similar, words.

81
00:06:41,620 --> 00:06:44,590
In some cases, you can still guess the original meaning,

82
00:06:44,590 --> 00:06:47,160
as in the case of the «European You Neon»,

83
00:06:47,160 --> 00:06:51,120
and a subtitler may decide not to correct them.

84
00:06:51,120 --> 00:06:56,430
In other cases, this is not an option, as understanding is much harder,

85
00:06:56,430 --> 00:07:00,860
as in the case of respeaker being recognized

86
00:07:00,860 --> 00:07:04,420
as the surname Rees plus the noun peaker.

87
00:07:05,410 --> 00:07:09,340
In this last case, readers cannot understand the meaning of the subtitle,

88
00:07:09,340 --> 00:07:13,460
but they can at least tell that the subtitle contains a mistake.
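The minor / standard / major distinction described above is often given numeric weights when errors are tallied. The weights 0.25, 0.5 and 1 are commonly cited for the NER model, but treat the exact values here as an assumption, and the helper itself as a hypothetical illustration.

```python
# Assumed severity weights (commonly cited for the NER model,
# but not guaranteed to match every implementation):
ERROR_WEIGHTS = {"minor": 0.25, "standard": 0.5, "major": 1.0}


def weighted_error_score(errors):
    """Sum the severity weights of a list of observed errors."""
    return sum(ERROR_WEIGHTS[severity] for severity in errors)


# One lowercase-b-style minor slip, one «European You Neon»-style
# standard error, and one misleading major error:
print(weighted_error_score(["minor", "standard", "major"]))  # 1.75
```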
89
00:07:14,390 --> 00:07:19,140
When readers cannot tell that the subtitles contain a mistake,

90
00:07:19,140 --> 00:07:24,720
we are dealing with major errors, or misleading mistakes,

91
00:07:24,720 --> 00:07:29,040
as in the case of the sentence «I'm a bespeaker»

92
00:07:29,040 --> 00:07:31,150
instead of «I'm a respeaker».

93
00:07:31,150 --> 00:07:36,330
In this case, the sentence reads as plausible but conveys wrong information.

94
00:07:38,080 --> 00:07:39,200
Generally speaking,

95
00:07:39,200 --> 00:07:42,800
verbatim subtitles are considered acceptable

96
00:07:42,800 --> 00:07:46,530
when they contain less than 2% of mistakes.

97
00:07:46,530 --> 00:07:50,460
Sensatim subtitles are considered of acceptable quality

98
00:07:50,460 --> 00:07:55,270
when they contain less than 5% of mistakes.

99
00:07:55,270 --> 00:07:59,730
In the Intersteno world championships of fast writing,

100
00:07:59,730 --> 00:08:06,260
0.5% is the maximum error rate allowed to competitors.

101
00:08:12,070 --> 00:08:17,090
Speed is usually considered the most important aspect to assess

102
00:08:17,090 --> 00:08:20,220
when evaluating real-time subtitles.

103
00:08:21,050 --> 00:08:27,320
For a subtitler, training speed is extremely important.

104
00:08:27,320 --> 00:08:32,040
You can do so through two types of training:

105
00:08:32,040 --> 00:08:36,490
Text-to-Text training and Speech-to-Text training.

106
00:08:36,490 --> 00:08:38,400
Training with TAKI

107
00:08:38,400 --> 00:08:44,180
is a very good way of training one's speed,

108
00:08:44,180 --> 00:08:48,340
because it is a tool provided by Intersteno

109
00:08:48,340 --> 00:08:54,010
that allows you to copy a written text at the speed you want.

110
00:08:54,010 --> 00:08:59,330
Speech-to-Text skills are better trained

111
00:08:59,330 --> 00:09:04,050
with the Speech Capturing test, an Intersteno competition

112
00:09:04,050 --> 00:09:07,180
that requires you to write down what you hear.
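The three acceptability thresholds just mentioned — under 2% for verbatim, under 5% for sensatim, and 0.5% at the Intersteno world championships — can be encoded in a small checker. This is purely illustrative; the names are hypothetical.

```python
# Acceptability thresholds (maximum error rate, in percent)
# as stated in the lecture:
ACCEPTABLE_ERROR_RATE = {
    "verbatim": 2.0,    # verbatim subtitles: < 2% of mistakes
    "sensatim": 5.0,    # sensatim subtitles: < 5% of mistakes
    "intersteno": 0.5,  # Intersteno world championships limit
}


def is_acceptable(error_rate_percent, approach):
    """True if an error rate is below the threshold for the given
    subtitling approach or competition setting."""
    return error_rate_percent < ACCEPTABLE_ERROR_RATE[approach]


print(is_acceptable(1.5, "verbatim"))    # True
print(is_acceptable(3.0, "verbatim"))    # False
print(is_acceptable(3.0, "sensatim"))    # True
print(is_acceptable(0.6, "intersteno"))  # False
```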
113
00:09:16,350 --> 00:09:19,190
For a real-time intralingual subtitler,

114
00:09:19,190 --> 00:09:24,140
however, it is not enough to be accurate and rapid.

115
00:09:24,140 --> 00:09:27,050
It is fundamental to reach Mars!

116
00:09:28,230 --> 00:09:32,560
Or better, it is important to reach one's MARS,

117
00:09:32,560 --> 00:09:38,200
an acronym for Most Accurate and Rapid Speech-to-text dictation rate.

118
00:09:39,450 --> 00:09:40,580
In other words,

119
00:09:40,580 --> 00:09:46,420
MARS is one's capacity to produce accurate subtitles at a given speech rate.

120
00:09:46,420 --> 00:09:52,260
A real-time subtitler has to know his or her MARS because, beyond this threshold,

121
00:09:52,260 --> 00:09:57,010
subtitles are likely to be of lower quality.

122
00:09:57,010 --> 00:10:02,520
Beyond one's MARS, professionals should start thinking of exit strategies.

123
00:10:02,520 --> 00:10:06,450
A minimum professional standard is 100 words per minute,

124
00:10:06,450 --> 00:10:09,420
corresponding to 500 characters per minute,

125
00:10:09,420 --> 00:10:14,900
not including the voice commands necessary to dictate punctuation.

126
00:10:14,900 --> 00:10:16,810
This may sound like a lot,

127
00:10:16,810 --> 00:10:21,460
but real-life events can easily reach 200 words per minute.

128
00:10:21,460 --> 00:10:23,710
This means it is extremely important

129
00:10:23,710 --> 00:10:28,860
to constantly and consistently monitor one's MARS.

130
00:10:28,860 --> 00:10:32,620
To calculate it, we have developed a tool that tells you

131
00:10:32,620 --> 00:10:38,030
how many characters or words per minute each of us can respeak

132
00:10:38,030 --> 00:10:40,740
while keeping an acceptable error rate.

133
00:10:40,740 --> 00:10:45,160
Go to reachmars.eu and enjoy!

134
00:10:50,400 --> 00:10:53,970
In this video lecture, we have dealt with the last dictation skill,

135
00:10:53,970 --> 00:10:57,600
which is at the core of quality assurance.
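The equivalence stated above — 100 words per minute corresponding to 500 characters per minute — implies an average word length of five characters, a common convention. A hypothetical sketch of the conversion and of a MARS check (both helper names are invented here, not part of the reachmars.eu tool):

```python
# Assumption implied by the 100 wpm = 500 cpm equivalence above:
CHARS_PER_WORD = 5


def wpm_to_cpm(words_per_minute):
    """Convert words per minute to characters per minute,
    assuming five characters per word on average."""
    return words_per_minute * CHARS_PER_WORD


def within_mars(event_wpm, personal_mars_wpm):
    """True if the event's speech rate stays at or below one's MARS,
    i.e. the fastest rate at which one still subtitles accurately."""
    return event_wpm <= personal_mars_wpm


print(wpm_to_cpm(100))        # 500 cpm: the minimum professional standard
print(within_mars(200, 160))  # False: time to think of an exit strategy
```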
136
00:10:57,600 --> 00:11:02,650
In particular, we have seen how to understand different quality assurance systems,

137
00:11:02,650 --> 00:11:05,580
mainly verbatim and sensatim,

138
00:11:05,580 --> 00:11:10,570
depending on criteria like accuracy, delay, errors and speed.

139
00:11:10,570 --> 00:11:13,340
Finally, we have seen how to reach MARS,

140
00:11:13,340 --> 00:11:16,900
or a respeaker's capacity to subtitle at a given speed

141
00:11:16,900 --> 00:11:20,270
while keeping the error rate as low as possible.